Early detection of person-to-person transmission of emerging infectious
diseases such as avian influenza is crucial for containing pandemics.
We developed a simple permutation test and its refined version
for this purpose. A simulation study shows that the refined permutation
test is as powerful as or outcompetes the conventional test
built on asymptotic theory, especially when the sample size is small.
In addition, our resampling methods can be applied to a broad range
of problems where an asymptotic test is not available or fails. We
also found that decent statistical power could be attained with just
a small number of cases, if the disease is moderately transmissible
between humans.

1. Introduction. Most emerging infectious disease pathogens in humans 
cross from their natural zoonotic reservoir to human populations where early 
mutated, reassorted or recombined forms begin to spread from person-to- 
person [Antia et al. (2003)]. Examples include human immunodeficiency 
virus, monkey pox, severe acute respiratory syndrome and pandemic in- 
fluenza. Currently, a highly pathogenic avian influenza strain (H5N1) has 
been spreading from poultry to humans, mostly in Southeast Asia, with 
possible limited human-to-human spread through close contact in Indonesia 
[Butler (2006)]. A concern is that this virus could cause a large scale pan- 
demic as it becomes more adapted to human-to-human transmission. Real- 
time surveillance provides limited information on small clusters of human 
cases in terms of symptom onset times and physical location. It is critical to 
answer two questions in real time: 1. Is the infectious agent spreading from
person to person? and 2. If it is, how transmissible is it? The first question is 
novel and, to our knowledge, has not been addressed in the statistical liter- 
ature. The second question is an estimation problem, and various statistical 
methods using household data are applicable, such as the models based on 
observed final infection status [Longini and Koopman (1982), Becker and 
Hasofer (1997), O'Neill and Roberts (1999)] and those based on a discrete- 
time sequence of symptom onset [Rampey et al. (1992), Yang, Longini and 
Halloran (2006)]. Our major goal in this paper is to answer the first question, 
but an estimation method is needed for this goal. We base our approach on 
that in Yang, Longini and Halloran (2006). 

The statistical questions hinge on inference about the transmissibility of 
the infectious agent. The basic reproductive number, R0, is the fundamental
measure of the transmissibility of an emerging infectious agent. Given that
the emerging infectious agent is transmissible, estimates of R0 will generally
be small and are not very informative. In addition, estimation of some epidemic
characteristics such as secondary attack rates (SAR) and R0 heavily
relies on the specification of a correct transmission model. When there is 
no person-to-person transmission, estimates of these characteristics may be 
nonzero, but are not meaningful. Therefore, a test of the existence of person- 
to-person transmission can provide a solid ground for parameter estimation. 
Specifically, one would like to test whether the person-to-person transmission
probability, no matter how it is defined, is 0. As a probability always
takes values from 0 to 1, testing the boundary value 0 is a nonstandard
condition and poses an immediate challenge, because the null distributions of
the standard statistics on which tests are based are generally difficult
to derive. Although statisticians have discussed asymptotic tests for
a limited set of scenarios [Moran (1971), Self and Liang (1987), Feng and 
McCulloch (1992)], more often such an asymptotic null distribution is not 
available for a specific case. Furthermore, the validity of asymptotic tests 
depends on relatively large sample sizes, which may compromise the power 
of such tests to detect person-to-person transmission if applied to a small 
sample size, such as those generated by avian influenza. These challenges 
motivate our investigation in exact rather than asymptotic testing methods. 

2. Methods. The data structure we usually observe is a sequence of 
symptom onsets and associated cluster information, for example, at what 
time a symptom onset occurred in which household. To construct a proba- 
bility model with a reasonable level of complexity from the observed data, 
it is necessary to make basic assumptions about the natural history of the 
disease and the transmission mechanism. We assume that the incubation pe- 
riod is the same as the latent period, but other assumptions could be made 
about the relation of the two periods. We make the following additional 
assumptions about the disease. Any newly infected person remains asymptomatic
over a period of δ days (the incubation period) before symptom
onset, where δ is a random quantity with distribution g(l) = Pr(δ = l),
l = δ_min, δ_min + 1, ..., δ_max. We denote by δ_min and δ_max the minimum and
maximum durations (in days) of the latent period. Upon symptom onset, the
person becomes and remains infectious over a period of η days (the infectious
period), where η is also a random quantity with distribution f(l) = Pr(η = l),
l = η_min, η_min + 1, ..., η_max. Similarly, η_min and η_max are the minimum and
maximum durations of the infectious period. In this paper our method requires
pre-specifying g(l) and f(l).

We consider the dynamic of a community-based epidemic on a day-by- 
day basis. We assume that the whole community is exposed to some external 
source with a constant level of infectivity for S days. Such an external com- 
mon source takes into account all possible channels, such as exposure to 
infected animals, through which the disease can be introduced into the com- 
munity. Let b be the probability that a susceptible person in the community 
is infected by the common source during one day of exposure. The prob- 
ability of infection by the common source throughout the S-day exposure 
period is called the community probability of infection (CPI) and is given by 
1 − (1 − b)^S [Longini and Koopman (1982)]. Once the disease is introduced
into the community, transmission between people may occur through con- 
tacts. There are various types of contacts one can define. We define a contact 
as all possible interactions during one day that can potentially transmit the 
disease from an infective person to a susceptible person. We consider two 
levels of contacts: close contacts between two persons who live in the same 
household and casual contacts between two persons who live in different 
households but may make contact in the community. We denote by p1 the
daily probability of transmission with a close contact, and by p2 that with a
casual contact.
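
As a small numerical illustration of the CPI defined above, the following Python sketch evaluates 1 − (1 − b)^S. The function name is ours and purely illustrative; the values of b and S = 30 are those used later in the simulation study.

```python
def community_probability_of_infection(b: float, S: int) -> float:
    """CPI = 1 - (1 - b)^S: probability of infection by the common
    source over S days of exposure with daily probability b."""
    return 1.0 - (1.0 - b) ** S


if __name__ == "__main__":
    # Daily common-source probabilities spanning the simulated range
    for b in (0.0002, 0.001, 0.002):
        print(b, round(community_probability_of_infection(b, 30), 3))
```

With S = 30, the three values of b above give CPIs of roughly 0.006, 0.030 and 0.058.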

With the above setting, we can construct a likelihood and obtain the 
maximum likelihood estimates (MLEs) for the unknown parameters (b, p1
and p2) as given in the Appendix. Two quantities related to transmission
probabilities that we would also like to estimate are the SAR and R0. The
SAR is defined as the probability of infection if a susceptible is exposed
to an infective during his or her infectious period. Corresponding to the
two levels of contact, there are two types of SAR, defined as

\[ \mathrm{SAR}_k = \sum_{l} f(l) \{1 - (1 - p_k)^{l}\}, \qquad k = 1, 2. \]

SAR1 is the SAR within households and is of more
epidemiological interest than SAR2. The basic reproductive number refers to
the expected number of people a typical infective person can infect among a 
large susceptible population. Here we are interested in the expected number 
of people that an infective person can infect given that he or she is the first 
infected person in this community. We refer to this as the local reproductive 
number R. Estimates of the local R cannot be generalized to a broader 
context because of the potential selection bias. The clusters are often selected
based on a number of cases and may represent higher R0 than in the general
population. For a community of N individuals with a uniform household size
M, we have R = (M − 1) × SAR1 + (N − M) × SAR2.
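
The SAR and local R formulas can be checked numerically. The sketch below is illustrative only: the function names are ours, the infectious period is taken as uniform over 3, 4 and 5 days as in the simulation study, and N is read as the total number of individuals (500 for 100 households of size 5), which is the reading that reproduces the SAR1 and R columns of Table 1.

```python
def secondary_attack_rate(p, infectious_pmf):
    """SAR_k = sum_l f(l) * (1 - (1 - p_k)^l), with f given as {days: probability}."""
    return sum(prob * (1.0 - (1.0 - p) ** days)
               for days, prob in infectious_pmf.items())


def local_reproductive_number(p1, p2, infectious_pmf, population_size, household_size):
    """R = (M - 1) * SAR1 + (N - M) * SAR2 (N assumed to count individuals)."""
    sar1 = secondary_attack_rate(p1, infectious_pmf)
    sar2 = secondary_attack_rate(p2, infectious_pmf)
    return (household_size - 1) * sar1 + (population_size - household_size) * sar2


if __name__ == "__main__":
    f = {3: 1 / 3, 4: 1 / 3, 5: 1 / 3}  # uniform infectious period (simulation setting)
    print(round(secondary_attack_rate(0.014, f), 3))
    print(round(local_reproductive_number(0.004, 0.00005, f, 500, 5), 2))
```

Under these assumptions, p1 = 0.014 gives SAR1 close to 0.055, and p1 = 0.004 with p2 = 0.00005 gives R close to 0.16.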

Nonzero estimates of p1 or p2 do not necessarily imply that their true values
are nonzero. In addition, construction of valid 95% confidence intervals
for the estimates of transmission probabilities is difficult when their true
values are 0. Therefore, a valid test of the hypothesis p1 = p2 = 0 would be
of great public health interest. A formal statement of the hypothesis test is

\[ H_0: p_1 = p_2 = 0 \quad \text{vs.} \quad H_1: p_1 > 0 \ \text{or} \ p_2 > 0, \]

where H0 is the null hypothesis and H1 is the alternative hypothesis.
A natural choice of test statistic is the likelihood ratio statistic

\[ \lambda = -2 \log \frac{\sup_{b} L_0(b \mid t_i, i = 1, \ldots, N)}{\sup_{b, p_1, p_2} L(b, p_1, p_2 \mid t_i, i = 1, \ldots, N)}, \tag{1} \]

where the numerator is the maximum likelihood (ML) when we restrict
p1 = p2 = 0, and the denominator is the ML without such restriction, both
conditioning on observed symptom onset times t_i (t_i = ∞ for uninfected
individuals). Explicit expressions of the likelihoods are given in the Appendix.
The likelihood ratio statistic asymptotically follows a chi-square distribution
with 2 degrees of freedom when H0 is true, if all regularity conditions
hold for this probability structure [Lehmann (1999)]. However, two nonstandard
conditions are present in our case. One is that the hypothesized parameter
values under testing lie on the boundary of the parameter space. As mentioned
before, the asymptotic null distribution is generally difficult to derive when
boundary values are to be tested. Self and Liang (1987) discussed asymptotic
distributions of the likelihood ratio statistic for some settings of boundary
parameters, but our case is not one of them. The other nonstandard condition
is that the parameters to be tested affect the domain of observable data.
When p1 = p2 = 0, infections are confined to the S days with exposure to the
common infective source. Therefore, no symptom onset can happen after day
S + δ_max. When p1 ≠ 0 or p2 ≠ 0, the domain of the observable data is much
larger. No valid asymptotic test exists when this nonstandard condition is
present, unless we only use the data up to day S for testing at the price of
losing some information.

Resampling methods have been widely applied to hypothesis testing, espe- 
cially in the recent decade because of their easy implementation with modern 
computational capacity. While employing less stringent model assumptions, 
these methods can attain the same level of statistical power as standard 
tests [Hoeffding (1952), Box and Andersen (1955)]. Permutation tests (or 
randomization tests) have been well developed in the setting of two-sample
comparison and ANOVA [Fisher (1935), Pitman (1937), Welch (1990)]. For
the boundary problem with parameter values specified by H0, the bootstrap
was used in combination with the likelihood ratio statistic to test the number 
of components in mixture models [McLachlan (1987), Feng and McCulloch 
(1996)]. We propose two approaches, a simple permutation test and a more 
refined one, for the problem of testing the person-to-person transmission 
probability. These resampling-based methods do not suffer from the two 
nonstandard conditions mentioned above, as shown by a simulation study. 
When the observed data are truly generated from H0, we can reassign all
of the observed symptom onset days (and associated infection status) to a
different collection of individuals, and every such rearrangement is equally
likely with the same likelihood L0. The empirical distribution of the test
statistic calculated from permuting symptom onset days across the population
can then be used to approximate the null distribution under H0. This
simple permutation test can be refined by varying symptom onset days of
infected individuals in any given permuted data while keeping the likelihood
L0 under the null hypothesis unchanged. The refined permutation test resamples
data points from a much larger sampling space as compared to the
simple permutation test. Technical details concerning development of the 
two resampling methods can be found in the Appendix. 

We first use simulations to verify the validity of the resampling methods 
by comparing them to the asymptotic test for a simpler scenario with only 
b and pi, that is, person-to-person transmission can only happen within 
households. For this two-parameter setting, Self and Liang (1987) showed 
that λ will asymptotically follow a mixture of the χ²_0 and χ²_1 distributions with
equal mixing probability. Only data up to day S are used for such comparison 
with the asymptotic test. We found that the refined permutation test has 
the best performance in terms of preserving type I error at the pre-specified 
level and yielding higher statistical power when population size and the 
number of cases are small. Results and discussion for the simple scenario are 
provided in the Appendix as well. Then we use simulations to investigate 
the performance of the refined permutation test for the scenario with three 
parameters: b, p1 and p2.
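
For the two-parameter comparison, the asymptotic p-value under the equal mixture of χ²_0 (a point mass at zero) and χ²_1 can be computed in closed form. The sketch below is our own illustration, not the authors' code; it uses the identity Pr(χ²_1 > x) = erfc(√(x/2)) and only the Python standard library.

```python
import math


def chi2_1_survival(x: float) -> float:
    """Pr(chi-square with 1 df > x), via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_p_value(lrt_statistic: float) -> float:
    """Asymptotic p-value under the 50:50 mixture of chi2_0 and chi2_1.

    chi2_0 is a point mass at zero, so it contributes to the tail
    probability only when the observed statistic is zero (or below,
    which can arise from numerical error).
    """
    if lrt_statistic <= 0.0:
        return 1.0
    return 0.5 * chi2_1_survival(lrt_statistic)


if __name__ == "__main__":
    print(boundary_lrt_p_value(2.7055))  # approximately 0.05
```

The 5% critical value is therefore about 2.71 rather than the 3.84 of an ordinary χ²_1 test.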

Fig. 1. A sample epidemic curve for b = 0.001, p1 = 0.014 and p2 = 0.00005. Cases from
the same household have the same color.

Computing λ involves calculating likelihoods under two different models:
the one with the parameter restriction conforming to H0 is the null model,
and the other one without any restriction is the full model. For a realized
epidemic, one of the two models may not be admissible (or possible). For
example, when the minimum interval between any pair of consecutive cases
is larger than the maximum duration of the latent period, no infection can
possibly be attributed to person-to-person infection; thus, only the null model
is admissible. On the other hand, when there is any case on or after
day S + δ_max, where δ_max is the maximum duration of the latent period,
only the full model is admissible because the common source is infective
only up to day S. When only the null (full) model is admissible, the p-value for
that epidemic is assigned 1 (0). Resampling-based tests are performed only
when both models are admissible. Checking admissibility can help avoid
nonconvergence problems when maximizing likelihoods.
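
The admissibility screening described above can be sketched as follows. This is a hypothetical helper of our own, applying the two rules literally as stated in the preceding paragraph. Two choices are assumptions not covered by the text: an epidemic with a single case is treated as having no pair of cases to which person-to-person transmission could be attributed, and an error is raised when neither model is admissible.

```python
def admissibility_p_value(onset_days, S, delta_max):
    """Return 1.0 if only the null model is admissible, 0.0 if only the
    full model is admissible, and None if both are (a resampling test
    is then required). Uninfected individuals are coded as None."""
    cases = sorted(t for t in onset_days if t is not None)
    if not cases:
        raise ValueError("no cases observed")

    # Rule from the text: a case on or after day S + delta_max cannot be
    # explained by the common source alone.
    null_admissible = all(t < S + delta_max for t in cases)

    # Rule from the text: if the minimum interval between consecutive
    # cases exceeds delta_max, no case is attributable to person-to-person
    # transmission. A single case has no such interval (our assumption).
    gaps = [later - earlier for earlier, later in zip(cases, cases[1:])]
    full_admissible = bool(gaps) and min(gaps) <= delta_max

    if null_admissible and full_admissible:
        return None
    if null_admissible:
        return 1.0
    if full_admissible:
        return 0.0
    raise ValueError("neither model is admissible under the assumed S and delta_max")
```

For example, with S = 30 and δ_max = 3, onsets on days 1, 10 and 20 yield a p-value of 1, while onsets on days 5, 6 and 35 yield a p-value of 0.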



3. Results. For simplicity, we simulate epidemics over a community com- 
posed of 100 households, each of size 5. We let the exposure to external 
common source last S = 30 days, and let the epidemic exhaust itself. We do 
not introduce initial cases to start the epidemic, but let the common source 
initiate infection. Simulation runs with zero infections were discarded. We 
simulate epidemics based on g(l) = 1/3, l = 1, 2, 3, and f(l) = 1/3, l = 3, 4, 5, and
these distributions are correctly specified by the methods that we evaluate.
All p-values presented in this section are obtained by the refined permuta- 
tion test, but simulations show that the simple permutation method gives 
similar results under the same population and parameter settings as dis- 
cussed here, except that it tends to be too conservative about preserving 
type I error for extremely small b. 
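
To make the simulation setting concrete, the following Python sketch generates one epidemic under the model of Section 2. It is our own minimal illustration, not the authors' simulation code. The assumptions are: a person infected on day t has symptom onset on day t + δ, is infectious on days onset, ..., onset + η − 1, and each infectious contact acts independently, so that the daily escape probability is (1 − b)^I(t ≤ S) (1 − p1)^(infectious household members) (1 − p2)^(infectious outsiders).

```python
import random


def simulate_epidemic(b, p1, p2, n_households=100, hh_size=5, S=30,
                      latent_days=(1, 2, 3), infectious_days=(3, 4, 5), seed=None):
    """Simulate one epidemic; return (onset, household), where onset[i] is the
    symptom onset day of person i (None if never infected)."""
    rng = random.Random(seed)
    n = n_households * hh_size
    household = [i // hh_size for i in range(n)]
    onset = [None] * n       # symptom onset day
    last_inf = [None] * n    # last infectious day
    t = 0
    while True:
        t += 1
        # Count persons infectious today, by household
        inf_hh = [0] * n_households
        for i in range(n):
            if onset[i] is not None and onset[i] <= t <= last_inf[i]:
                inf_hh[household[i]] += 1
        inf_total = sum(inf_hh)
        # Anyone still latent or infectious keeps the epidemic alive
        pending = any(last_inf[i] is not None and last_inf[i] >= t for i in range(n))
        if t > S and not pending:
            break  # common source is off and the epidemic has exhausted itself
        for i in range(n):
            if onset[i] is not None:
                continue  # already infected
            k = inf_hh[household[i]]
            escape = (1.0 - p1) ** k * (1.0 - p2) ** (inf_total - k)
            if t <= S:
                escape *= 1.0 - b
            if rng.random() > escape:
                onset[i] = t + rng.choice(latent_days)            # uniform g(l)
                last_inf[i] = onset[i] + rng.choice(infectious_days) - 1  # uniform f(l)
    return onset, household


if __name__ == "__main__":
    onset, household = simulate_epidemic(0.001, 0.014, 0.00005, seed=1)
    print(sorted(t for t in onset if t is not None))
```

Under p1 = p2 = 0 every simulated onset falls on or before day S + δ_max = 33, which is the domain restriction discussed in Section 2.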

Table 1
Power (×100) to detect person-to-person transmission for different settings of b and p1,
with p2 fixed at 0.00005 (SAR2 = 0.0002). Numbers in parentheses are the average
number of index cases over the average total number of cases. Results are based on 2000
simulations. 2000 permuted samples were drawn for each permutation test

Column settings (b, with the corresponding CPI):

b     0.0002  0.0004  0.0006  0.0008  0.0010  0.0012  0.0014  0.0016  0.0018  0.0020
CPI   0.006   0.012   0.018   0.024   0.030   0.035   0.041   0.047   0.053   0.058

Row settings (p1, with the corresponding SAR1 and local R):

p1       SAR1    R
0.0 (a)  0.0     0.0
0.004    0.016   0.16
0.006    0.024   0.19
0.008    0.032   0.23
0.010    0.039   0.26
0.012    0.047   0.29
0.014    0.055   0.32
0.016    0.062   0.35
0.018    0.070   0.38
0.022    0.085   0.44
0.026    0.10    0.50
0.030    0.11    0.56
0.034    0.13    0.61
0.038    0.14    0.67
0.042    0.16    0.73
0.046    0.17    0.78

Type I errors (×100) for p1 = p2 = 0: 4.3, 5.0, 5.1, 5.3, 4.2, 4.8, 4.9, 4.9, 5.0.

Power (×100) entries:

p1       b        Power
0.030    0.0002   75
0.034    0.0002   77
0.038    0.0002   80
0.042    0.0002   84
0.046    0.0002   86
0.004    0.0016   62
0.004    0.0018   62
0.004    0.0020   67
0.006    0.0016   72
0.006    0.0018   75
0.006    0.0020   79

(a) The presented values are type I errors when p1 = p2 = 0.0.



As p2 is of limited interest, we fix it at 0.00005 (SAR2 = 0.0002), and vary
b from 0.0002 to 0.002 (CPI from 0.006 to 0.058) with a step of 0.0002. We
vary p1 from 0.004 to 0.046 (SAR1 from 0.016 to 0.17) with steps chosen
specific to b so as to yield power values in the range of (0.6, 1.0). All tests
are performed at the level of 0.05, that is, we intend to have type I errors
of no more than 5% when p1 = p2 = 0. An epidemic curve of a sample run
for b = 0.001 (CPI = 0.03) and p1 = 0.014 (SAR1 = 0.055) is displayed in
Figure 1, with each block representing a symptomatic case. Cases from the 
same household are filled with the same color. A pattern is evident that cases 
in the same household tend to cluster together in time. The CPI, R and 
SAR given in the figure are based on the true parameters, but they could be 
estimated from the data as well. Results based on 2000 simulations and 2000 
permutations for each test are presented in Table 1. The first row, where p1 =
p2 = 0, gives type I errors for various values of b, from which it is observed
that type I errors are all preserved at the specified level. As expected, larger
p1 yields higher power for fixed b; similarly, larger b also yields higher power
for any given p1. Surprisingly, when there are as few as seven cases in total,
it is still possible to have 80% power with a moderate p1 (SAR1 =
0.14), which means that person-to-person transmission can still be detected
even when there is a very limited number of cases. This finding could be very
useful as most avian influenza epidemics in humans in recent years have a
scale of eight total cases or fewer. Of interest as well is that all of the R
values are below 1, as seen from the last column of Table 1.

Fig. 2. Power to detect person-to-person transmission for different settings of b and p1,
with p2 fixed at 0.00005. Results are based on 2000 simulations. 2000 permuted samples
were drawn for each permutation test. The dashed line is the 80% power contour line
obtained from Loess smoothing.

Figure 2 illustrates the information in Table 1, where power levels are
shown in different colors and symbols with b and p1 as the horizontal and
vertical axes, respectively. The 80% power contour curve obtained by Loess
smoothing lies between green circles and red downward triangles. This figure
clearly displays the trend of such a contour curve, descending sharply at b =
0.0002 (CPI = 0.006) and becoming flat around p1 = 0.008 (SAR1 = 0.032)
as b increases to 0.0014 (CPI = 0.041). Let N_idx denote the mean number
of index cases and N_tot the mean total number of cases, averaging over all
simulated epidemics. As only the numbers of cases are observable in real
epidemics, we replace b and p1 with N_idx and N_tot as the axes in Figure 3.
Not surprisingly, the underlying 80% power contour curve looks more linear,
since roughly N_tot ≈ (1 + R) N_idx. While R depends on p1, the range of 1 + R
is relatively narrow, about [1.2, 1.3] at b > 0.0006, and becomes narrower as b
increases. The figure also indicates that the power to detect person-to-person
transmission is jointly determined by N_idx and N_tot, instead of either alone.

Fig. 3. Power to detect person-to-person transmission plotted by the number of index
cases versus the total number of cases. Results are based on 2000 simulations. 2000
permuted samples were drawn for each permutation test. The dashed line is the 80% power
contour line obtained from Loess smoothing. The solid line is the lower bound (0) of power,
where the number of index cases equals the total number of cases.

We fitted a linear regression between the complementary log-log transformed
power values and selected transformations of N_idx and N_tot, and found the
following empirical formula:

\[ \text{Power} = \exp\{-\exp(1.29 + 0.75 N_{\mathrm{idx}} - 0.55 N_{\mathrm{tot}} - 1.40 \log(N_{\mathrm{idx}}))\}, \]

which explains 99% of the variation in power. Figure 4 plots the simulated
vs. fitted power values, where most points fall close to the diagonal line,
indicating that the empirical formula gives decent prediction, except for
one point at b = 0.0002 and p1 = 0.03, where the predicted power, 0.71, is
somewhat lower than the simulated power, 0.75. Such an empirical formula
could be used to predict power levels at various values of N_tot and N_idx
for which simulations are not performed. The coefficients in the empirical
formula will likely change for different parameter settings, and the linearity
may not always hold.
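
The empirical formula is simple to evaluate. The sketch below is illustrative: the function name is ours, "log" is assumed to be the natural logarithm, and the coefficients are those reported above, which apply only to the parameter settings of this simulation study.

```python
import math


def empirical_power(n_idx: float, n_tot: float) -> float:
    """Power = exp{-exp(1.29 + 0.75*N_idx - 0.55*N_tot - 1.40*log(N_idx))}.

    n_idx: mean number of index cases; n_tot: mean total number of cases.
    Natural logarithm assumed.
    """
    if n_idx <= 0:
        raise ValueError("n_idx must be positive")
    linear = 1.29 + 0.75 * n_idx - 0.55 * n_tot - 1.40 * math.log(n_idx)
    return math.exp(-math.exp(linear))


if __name__ == "__main__":
    # Hypothetical inputs: 3 index cases out of 7 or 8 cases in total
    print(round(empirical_power(3, 7), 3), round(empirical_power(3, 8), 3))
```

For a fixed number of index cases, the predicted power increases with the total number of cases, in line with Figure 3.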

Fig. 4. Plot of simulated and fitted values of power from the empirical formula
Power = exp{−exp(1.29 + 0.75 N_idx − 0.55 N_tot − 1.40 log(N_idx))}. A good formula should
have all the points falling close to the diagonal line.

Fig. 5. Trend of changes in power as mean duration of the latent period increases for
different settings of b and p1. Distributions of the latent period are uniform over three
days and correctly specified in the models. Results are based on 2000 simulations. 2000
permuted samples were drawn for each permutation test.

To investigate how sensitive the statistical power of the permutation test
is to the distribution of the latent period, we vary the true mean duration 
from 1.5 to 14 days, while keeping g(l) a uniform distribution over three 
days. These distributions of the latent period are correctly specified in the 
models. We expect to see an increase in power, because increasing the latent 
period is essentially increasing the generation time between successive cases 
[Fine (2003)]. To look at the trend of changes in power when b is small, 
medium and large, simulations were done under three parameter settings: 
(b = 0.0004 [CPI = 0.012], p1 = 0.014 [SAR1 = 0.055]), (b = 0.001 [CPI =
0.03], p1 = 0.006 [SAR1 = 0.024]) and (b = 0.002 [CPI = 0.058], p1 = 0.004
[SAR1 = 0.016]). The values of p1 are chosen to ensure that the initial power
is below 0.8 and has the potential of reaching or exceeding 0.8. Results are 
displayed in Figure 5. Overall, power increases, and the rate of increment 
decreases, as the mean duration of the latent period (and thus the generation 
time) becomes longer. However, the rate of increment is higher at larger 
values of b, which means that the power of the refined permutation test is 
more sensitive to the distribution of the latent period when b is large. Such 
sensitivity does not compromise the usefulness of the permutation test, since 
our simulation study is performed under the setting with the minimum level 
of power. For avian influenza, the mean latent period may be as long as 14 
days, and the power will very likely be higher than in our simulation setting. 

4. Discussion. We have proposed a simple permutation method and its 
refined version to test the presence of person-to-person transmission within 
or between households. Using simulations, we have shown that the resam- 
pling methods are comparable to or outcompete the standard asymptotic 
testing method where such asymptotic method is applicable. More impor- 
tantly, the resampling methods remain valid in many settings where the 
asymptotic method is not applicable or not available yet. We have shown 
that, for an infectious disease with relatively rare incidence, person-to-person 
transmission could still be detected with decent power even if the total num- 
ber of cases is as few as seven or eight, given that the transmission prob- 
ability is high and the population is relatively large. We have studied the 
statistical power of the resampling methods under the model with two levels 
of contacts: within households and between households. The methods could 
be generalized to models with additional clustering groups such as schools 
and work places. 

We have assumed that the latent and incubation periods are identical and 
that the distributions of the latent and infectious periods are known. Other 
assumptions about the relation between the latent and incubation periods 
could be made, but may lead to different inference procedures and conclu- 
sions. As the presence of the infectious period implies nonzero transmission 
probabilities, the actual alternative hypothesis we are testing is p1 > 0 or
p2 > 0 and η ~ f(l), that is, f(l) is a part of the parameters, but we fix it
rather than estimate it. Estimating g(l) and f(l) solely from a sequence of
symptom onsets is an ongoing research topic and is only practical for a relatively
large number of cases [Wallinga (2004), Cauchemez (2006)]. To use
our method in real epidemics, one could choose a range of plausible settings
of g(l) and f(l), and any setting yielding a significant p-value is a warning
sign of transmission between human beings. Appropriate adjustment for
multiple testing could be used, but one should be aware that these tests are 
highly correlated as they are essentially based on the same data set, and a 
Bonferroni-type adjustment is likely to be over-conservative. 

In our simulation study the likelihood is calculated up to day T − δ_max for
subjects who do not show symptoms up to day T, an incomplete adjustment
for right-censoring of infection status. A complete adjustment should take
into account that infection might have occurred after T − δ_max and the latent
period extends over T. Complete adjustments may be important for real-time
analysis, especially when T ≫ δ_max does not hold. In our simulation
setting, T ≫ δ_max approximately holds, and the bias in parameter estimates
induced by right-censoring is minimal according to the simulation results in 
Yang, Longini and Halloran (2006). 

When conducting the test, maximum likelihood estimates of b, p1 and
p2 are obtained. From these, estimates of other quantities such as the local
reproductive number R and SAR can be derived. We note that, fixed at a
value as small as 0.00005 (SAR2 = 0.0002), p2 is generally underestimated
due to limited information and, consequently, R is also biased downward.
Based on simulation results (not shown), the bias decreases as the true value
of p2 or size of the data increases.

We have assumed that each susceptible individual is exposed to an ex- 
ternal common infectious source up to day S. One may argue that such 
exposure may only be reasonable for a subset of the population in some sit- 
uations. Our model can be applied to such situations as well, but only when 
there is no infected case in the subpopulation which is not exposed to the 
common source; otherwise, person-to-person transmission exists for sure. In 
addition, the exposure level to the common source can be assumed as varying 
from household to household, but permutation should be restricted within 
households and inference must be supported with sufficient data. 

In real epidemics, statistical inference may be very sensitive to the spec- 
ification of S. Particularly, mis-specifying a smaller value for S will likely 
increase the type I error, as cases that appear after S + δ_max must be accounted
for by intensive person-to-person transmission. If no relevant information
is available for determining S, assuming S > T will yield the most
conservative p-value. Changing the value of S may affect the admissibility 
of models, depending on the specification of g(l) and f(l). To apply our
methods, it is necessary to ensure that both the null and the alternative
models are admissible under these assumptions. Additionally, it may be difficult
to identify a clear cut point for the common source exposure, and how
to impose the censoring mechanism on S without compromising the test 
performance is open to further research. 

Early detection of person-to-person transmission from limited data is cru- 
cial in containing pandemics of emerging infectious diseases such as avian 
influenza, and our work provides an effective tool for such evaluation. Our 
method requires not only a time sequence of symptom onsets, but also data 
on membership of households, whether or not they have cases. We believe 
that such data requirements are reasonable, and that the information could 
be collected by local health authorities. When only households with cases 
are available, selection bias needs to be addressed to make the test valid, 
which is a topic for further investigation. 



APPENDIX 

A.1. Statistical model. Assume that the epidemic starts on day 1 and
stops by day T in a population of size N. Let t_i be the symptom onset day
for an infected person i. The probability that an infective person j
infects subject i on day t, given that subject i is not infected up through
day t − 1, is expressed as

\[ p_{ji}(t) = p_1^{I(j \in H_i)} \, p_2^{I(j \notin H_i)} \, f(t - t_j), \tag{2} \]

where I(·) is the indicator function (1: true, 0: false), H_i is the set of people
in the same household with person i, and f(l) is the distribution of the
infectious period. The probability that subject i escapes infection from all
infective sources on day t, conditioning on that subject i is not infected up
through day t − 1, is then given by

\[ e_i(t) = (1 - b)^{I(t \le S)} \prod_{j=1}^{N} \{1 - p_{ji}(t)\}. \tag{3} \]

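Because the exact infection date is unobservable, we assume that the duration
of the latent period δ is distributed as g(l) = Pr(δ = l), l = δ_min, δ_min +
1, ..., δ_max, so that we can construct a likelihood for person i as the following:

\[
L_i(b, p_1, p_2 \mid t_j, j = 1, \ldots, N) =
\begin{cases}
\prod_{t=1}^{T} e_i(t), & \text{not infected}, \\
\sum_{t = t_i - \delta_{\max}}^{t_i - \delta_{\min}} g(t_i - t) \{1 - e_i(t)\} \prod_{\tau=1}^{t-1} e_i(\tau), & \text{otherwise}.
\end{cases}
\tag{4}
\]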
The overall likelihood L(b, p1, p2 | t_i, i = 1, ..., N) = ∏_i L_i(b, p1, p2 | t_j, j = 1,
..., N) for the full model is maximized with respect to b, p1 and p2 to obtain
the MLEs of the three parameters, and from these, the estimates of CPI,
SARs and R. For notational convenience, we suppress the information about
household membership that should appear behind the condition symbol in
L. When there is no person-to-person transmission, that is, p1 = p2 = 0, (3)
reduces to

\[ e_i(t) = (1 - b)^{I(t \le S)}. \]

Let L0(b | t_i, i = 1, ..., N) denote the likelihood for the null model. The test
statistic is defined as in (1).
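
As an illustration of the null model, the sketch below evaluates log L0 and maximizes it over b by a simple grid search. It is our own minimal example, not the authors' implementation. The assumptions are: T ≥ S, so that an uninfected person contributes (1 − b)^S; an infected person with onset day t_i contributes the sum over admissible infection days t = t_i − l (1 ≤ t ≤ S) of g(l) b (1 − b)^(t − 1); and a log-spaced grid stands in for a proper numerical optimizer.

```python
import math


def null_log_likelihood(b, onset_days, S, latent_pmf):
    """log L0(b | onsets) under p1 = p2 = 0; uninfected persons are None.

    latent_pmf maps a latent duration l (days) to g(l)."""
    loglik = 0.0
    for t_i in onset_days:
        if t_i is None:
            # Escaped the common source on each of the S exposure days
            loglik += S * math.log(1.0 - b)
            continue
        contribution = 0.0
        for delay, prob in latent_pmf.items():
            t = t_i - delay  # candidate infection day
            if 1 <= t <= S:
                contribution += prob * b * (1.0 - b) ** (t - 1)
        if contribution == 0.0:
            return float("-inf")  # onset cannot be explained by the null model
        loglik += math.log(contribution)
    return loglik


def maximize_null(onset_days, S, latent_pmf, grid_size=2000):
    """Grid-search MLE of b on a log-spaced grid over [1e-6, 1)."""
    candidates = [10.0 ** (-6.0 + 6.0 * k / grid_size) for k in range(grid_size)]
    best_b = max(candidates,
                 key=lambda b: null_log_likelihood(b, onset_days, S, latent_pmf))
    return best_b, null_log_likelihood(best_b, onset_days, S, latent_pmf)


if __name__ == "__main__":
    g = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
    data = [10] * 15 + [None] * 485  # hypothetical data: 15 cases, onset on day 10
    print(maximize_null(data, S=30, latent_pmf=g))
```

The same pattern extends to the full model by replacing the escape probability with (3), at the cost of tracking who is infectious on each day.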

A.2. Null distribution. 

A.2.1. Resampling distribution. Consider the observed data set as a sample 
point from the space of all possible infection statuses and symptom onset 
times that could occur based on the given population and parameter setting. 
There exists a class of sample points, which we refer to as the likelihood 
equivalence class, that have the same likelihood $L_0(b \mid t_i, i = 1, \ldots, N)$ as the 
observed data under the null hypothesis $H_0: p_1 = p_2 = 0$. If the null hypothesis 
is true, each sample point in the class occurs with equal probability. 
That is, if such a class is identifiable, we can obtain the null distribution 
of the test statistic by resampling sample points from the class with equal 
probability. Clearly, sample points obtained by permuting the observed infection 
status and associated symptom onset dates across the population 
belong to the likelihood equivalence class. Generally, the whole likelihood 
equivalence class is difficult to identify, whereas the use of permuted samples 
is straightforward and fruitful. Let $(t_1^{(k)}, t_2^{(k)}, \ldots, t_N^{(k)})$ be the $k$th permuted 
sample of $(t_1, t_2, \ldots, t_N)$, and let $\lambda^{(k)}$ be the corresponding test statistic, 
$k = 1, \ldots, M$. Then the empirical distribution of $\lambda^{(k)}$ over all $k$ can serve as 
the null distribution of $\lambda$, and the p-value is given by $\frac{1}{M} \sum_k I(\lambda \le \lambda^{(k)})$. 
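
The simple permutation p-value described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names and the toy statistic `same_household_case_pairs` are hypothetical and stand in for the likelihood ratio statistic of (1), which would require maximizing the full and null likelihoods. Household membership is assumed to be encoded by position in the list, so shuffling the records moves infection status and onset day together across people.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical record for one person: (infected?, symptom onset day or None).
Record = Tuple[bool, Optional[int]]


def permutation_p_value(
    records: Sequence[Record],
    statistic: Callable[[Sequence[Record]], float],
    n_perm: int = 2000,
    seed: Optional[int] = None,
) -> float:
    """Fraction of permuted data sets whose statistic is >= the observed one."""
    rng = random.Random(seed)
    observed = statistic(records)
    shuffled: List[Record] = list(records)
    hits = 0
    for _ in range(n_perm):
        # Permute (status, onset) pairs jointly across people; positions,
        # and hence household membership, stay fixed.
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            hits += 1
    return hits / n_perm


def same_household_case_pairs(records: Sequence[Record], household_size: int = 5) -> float:
    """Toy statistic (an assumption, not the paper's): case pairs sharing a household."""
    total = 0
    for start in range(0, len(records), household_size):
        household = records[start:start + household_size]
        cases = sum(1 for infected, _ in household if infected)
        total += cases * (cases - 1) // 2
    return float(total)
```

Any statistic that is large when cases cluster within households can be passed as `statistic`; replacing the toy function with a likelihood ratio routine would give the simple permutation test in the sense used here.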

In our situation, however, it is possible to identify a subset of the likelihood 
equivalence class that contains the permuted samples and is much larger than 
the set of permuted samples. The idea is most clearly illustrated in the situation without the 
latent period. Suppose that infection times are observable, and let $t_i$ denote 
the infection time instead of the symptom onset time for now. Then, the 
likelihood for the null model is given by 

$$L_0(b \mid t_i, i = 1, \ldots, N) = \prod_{i \in \bar D} (1-b)^{S} \times \prod_{i \in D} \bigl((1-b)^{t_i - 1} b\bigr) = (1-b)^{(N - \tilde N) S - \tilde N + \sum_{i \in D} t_i}\, b^{\tilde N}, \tag{5}$$

where $D$ is the set of $\tilde N$ infected subjects and $\bar D$ the set of uninfected subjects. 
Therefore, one can randomly re-arrange the infection status and infection 
times while keeping the likelihood value unchanged, as long as the sum 
of infection times ($\sum_{i \in D} t_i$) and the number of infections ($\tilde N$) remain the 
same. Each re-arrangement is a sample point in the likelihood equivalence 
class. To keep $\tilde N$ unchanged, a permutation of the infection and associated 
symptom status across the population would suffice, and we refer to it as 
the initial stage of the resampling procedure. The next stage, which we call 
the refinement stage, is to draw a sample point with equal probability from 
all possible distinct re-arrangements of infection times, given the infected 
cases are fixed. If the refinement stage is not carefully planned, the principle 
of equal probability can be easily violated, and the consequence is incor- 
rect type I error and/or insufficient statistical power. The problem can be 
re-stated as sampling with equal probability from all distinct arrangements 
of n balls (sum of infection times) into m boxes (infected cases), each box 
with a fixed volume of $v$ ($S$). Let $W(n, m, v)$ be the number of all possible 
distinct arrangements under these constraints. It satisfies a recursion that can 
be solved by 

$$W(n, m, v) = \sum_{k=0}^{\min(n, v)} W(n - k, m - 1, v), \tag{6}$$

with the stopping rules $W(n, 0, v) = 0$ for $n > 0$, $W(0, m, v) = 1$ and $W(n, 1, v) = 
I(n \le v)$. An arrangement can be sampled with equal probability through 
the following procedure: 

1. Start with the box labeled $i = 1$, with $N_1 = n$ balls to be distributed. 

2. In step $i$, let $N_i$ be the number of balls not yet distributed. Randomly 
choose an integer $n_i$ from $\{0, 1, \ldots, r\}$ according to the weights $W(N_i - 
k, m - i, v)$, $k = 0, 1, \ldots, r$, where $r = \min(N_i, v)$, and assign $n_i$ balls to 
box $i$. Let $N_{i+1} = N_i - n_i$, and go to box $i + 1$. 

3. In the last step, distribute all the remaining $N_m$ balls to box $m$. 

$N_m$ will not exceed $v$, because in step $m - 1$ any arrangement 
resulting in $N_m > v$ has a weight of 0 and thus is excluded from sampling. 
Hence, this sampling procedure has the advantage of looping over all boxes 
only once. 
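
The counting recursion and the three-step sampling procedure above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, memoization through `functools.lru_cache` is our choice for evaluating the recursion, and how infection times are mapped to ball counts (for example, whether an offset of one day is subtracted first) is left to the caller.

```python
import random
from functools import lru_cache
from typing import List, Optional


@lru_cache(maxsize=None)
def count_arrangements(n: int, m: int, v: int) -> int:
    """W(n, m, v): ways to put n balls into m labeled boxes, at most v per box."""
    if n == 0:
        return 1                      # W(0, m, v) = 1
    if m == 0:
        return 0                      # W(n, 0, v) = 0 for n > 0
    if m == 1:
        return 1 if n <= v else 0     # W(n, 1, v) = I(n <= v)
    # Put k balls in the current box and count the ways to place the rest.
    return sum(count_arrangements(n - k, m - 1, v) for k in range(min(n, v) + 1))


def sample_arrangement(n: int, m: int, v: int,
                       rng: Optional[random.Random] = None) -> List[int]:
    """Draw one arrangement uniformly from all W(n, m, v) arrangements."""
    if m < 1 or count_arrangements(n, m, v) == 0:
        raise ValueError("no feasible arrangement for these n, m, v")
    rng = rng or random.Random()
    remaining = n                     # N_i: balls not yet distributed
    counts: List[int] = []
    for i in range(1, m):             # boxes 1, ..., m - 1
        r = min(remaining, v)
        # Weight of choosing k is the number of ways to finish the other boxes.
        weights = [count_arrangements(remaining - k, m - i, v) for k in range(r + 1)]
        k = rng.choices(range(r + 1), weights=weights)[0]
        counts.append(k)
        remaining -= k
    counts.append(remaining)          # last box takes all remaining balls
    return counts
```

Because the weight of each choice equals the number of completions it allows, the product of the step probabilities for any full arrangement is 1 / W(n, m, v), which is the equal-probability property the refinement stage needs.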

This sampling scheme can be adapted to situations with a latent period, 
but symptom onset times instead of infection times are subject to 
re-arrangements. The main deviation from the above ideal situation is that, 
because some cases may have special exposure history, re-arrangement of 
their symptom onset times will likely change the whole likelihood, and thus, 
they should be excluded from the refinement stage. One example is seen in 
simulations, where we let the exposure to a common source of infection last 
from day 1 to day $S$, and let the latent period vary from $\delta_{\min}$ to $\delta_{\max}$ days. 
For any case $i$ with symptom onset time $t_i > \delta_{\max}$, there are $\delta_{\max} - \delta_{\min} + 1$ 
days on which infection could have happened, that is, any day between $t_i - \delta_{\max}$ 
and $t_i - \delta_{\min}$. The symptom onset time of case $i$ could be re-arranged from day 
$\delta_{\max} + 1$ to day $S + \delta_{\min}$ without changing the likelihood of the null model, 
as long as the sum of symptom onset times is not changed. However, there 
may be cases with symptom onset between day $\delta_{\min} + 1$ and day $\delta_{\max}$, 
for whom the number of days on which infection could have happened is less than 
$\delta_{\max} - \delta_{\min} + 1$. Re-arrangement of symptom onset times of these cases will 
very likely change the likelihood because the number of potential infection 
days will also change. Similarly, cases with symptom onset after day $S + \delta_{\min}$ 
should be excluded from the refinement stage as well. 
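
The exclusion rule in this example can be checked with a short Python sketch. It assumes, as in the simulation setting described above, that under the null model infection is possible only on days 1 through S; the function names and the list-based data layout are hypothetical.

```python
from typing import List, Sequence


def potential_infection_days(onset: int, S: int, delta_min: int, delta_max: int) -> int:
    """Number of days in 1..S on which a case with this onset day could have been infected."""
    first = max(1, onset - delta_max)
    last = min(S, onset - delta_min)
    return max(0, last - first + 1)


def refinement_eligible(onset_days: Sequence[int], S: int,
                        delta_min: int, delta_max: int) -> List[int]:
    """Indices of cases whose onset lies in [delta_max + 1, S + delta_min]."""
    lo, hi = delta_max + 1, S + delta_min
    return [i for i, t in enumerate(onset_days) if lo <= t <= hi]
```

With S = 10, a latent period of 1 to 3 days and onset days 2, 3, 4, 11 and 12, only the cases with onset on days 4 and 11 have the full window of 3 potential infection days, so only those two would enter the refinement stage.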

A.2.2. Asymptotic distribution. While the asymptotic null distribution 
of $\lambda$ is not readily available for testing $H_0: p_1 = p_2 = 0$, it is available for 
testing $H_0: p_1 = 0$ if we fix $p_2 = 0$, that is, infection is only possible by the 
common source or within-household contacts. In this two-parameter setting, 
the escape probability for person $i$ on day $t$ given the existence of person-to-person 
transmission is 

$$e_i(t) = (1-b)^{I(t \le S)} \prod_{j} \bigl(1 - p_1 f(t - t_j)\bigr),$$

where the product is over the other members $j$ of person $i$'s household, and the test statistic is 

$$\lambda = -2 \log \frac{\sup_{b} L_0(b \mid t_i, i = 1, \ldots, N)}{\sup_{b, p_1} L(b, p_1 \mid t_i, i = 1, \ldots, N)}. \tag{7}$$

Self and Liang (1987) showed that $\lambda \sim \frac{1}{2}\chi^2_0 + \frac{1}{2}\chi^2_1$ under $H_0: p_1 = 0$ in such a 
model, where $\chi^2_0$ is a constant (a point mass at zero) and $\chi^2_1$ is a chi-square random variable with 
one degree of freedom. 
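
For a given value of the statistic, the p-value under this 50:50 mixture is straightforward to compute. The sketch below uses only the Python standard library and the identity that the upper tail of a chi-square variable with one degree of freedom at x equals erfc(sqrt(x / 2)); the function names are hypothetical.

```python
import math


def chi2_1_sf(x: float) -> float:
    """Upper tail P(X > x) for a chi-square variable with 1 degree of freedom, x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p_value(lam: float) -> float:
    """P-value of lam under the mixture 0.5 * chi2_0 + 0.5 * chi2_1."""
    if lam <= 0.0:
        # The point mass at zero and the whole chi2_1 component are >= lam.
        return 1.0
    # For lam > 0 only the chi2_1 half of the mixture can reach lam.
    return 0.5 * chi2_1_sf(lam)
```

One consequence is that a test at level 0.05 rejects when the chi-square tail probability falls below 0.10, so the critical value is about 2.71 rather than the 3.84 of an ordinary one-degree-of-freedom test.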

A.3. Simulation study in the two-parameter setting. We compare the 
resampling test to the asymptotic test via a simulation study for the 
two-parameter setting. Only data observed up to day $S$, the last day of exposure 
to the common infective source, are used for testing to make the comparison 
fair, because the asymptotic test cannot handle data beyond day $S + \delta_{\max}$. 
The resampling method has two variations, one involving only the initial 
permutation stage, and the other having both stages. The former is referred 
to as the simple permutation test, which is widely applied to many problems; 
the latter is called the refined permutation test in this paper to distinguish 
the two variations. We shall show through simulations 
that the refined permutation test has some advantages over both the simple 
permutation test and the asymptotic test for small sample sizes, and that 
the three tests tend to be equivalent for large sample sizes. By large sample 
size, we mean both a relatively large population and a large number of cases 
of the disease. 

Table 2 

Comparison of type I error and power between the permutation test and the asymptotic 
test for models with only $b$ and $p_1$. The community is composed of 4 households of size 5. 
Results are based on 5000 simulations. 2000 permuted samples were drawn for each test 

b      CPI    p_1^a   SAR_1   N_idx^b   N_tot^c   Asymptotic   Simple permutation   Refined permutation 
0.01   0.26   0.0     0.0     3         5         0.029        0.039                0.050 
0.01   0.26   0.02    0.078   3         6         0.21         0.22                 0.26 
0.01   0.26   0.05    0.18    3         8         0.60         0.57                 0.63 
0.01   0.26   0.08    0.28    3         10        0.85         0.81                 0.85 
0.02   0.45   0.0     0.0     4         9         0.034        0.046                0.049 
0.02   0.45   0.02    0.078   4         10        0.21         0.21                 0.24 
0.02   0.45   0.05    0.18    4         12        0.60         0.54                 0.63 
0.02   0.45   0.08    0.28    4         14        0.87         0.79                 0.87 
0.03   0.6    0.0     0.0     4         11        0.048        0.049                0.048 
0.03   0.6    0.02    0.078   4         13        0.18         0.19                 0.22 
0.03   0.6    0.05    0.18    4         15        0.55         0.48                 0.58 
0.03   0.6    0.08    0.28    4         16        0.80         0.67                 0.81 

a Type I errors are reported when $p_1 = 0$. 
b N_idx is the average number of index cases. 
c N_tot is the average total number of cases. 

We first present simulation results in Table 2 for a small population 
composed of 4 households, each of size 5. Values of $b$ and $p_1$ are chosen to cover 
a full range of statistical power levels. When $p_1 = 0$, the reported values are 
type I errors. Clearly, the refined permutation test preserves type I error 
at the specified level of 0.05 for all settings of $b$. The asymptotic test is 
the most conservative, with the smallest type I errors when there are 10 or 
fewer cases. Surprisingly, the 
simple permutation test is also conservative when there are only a few cases, 
but less so than the asymptotic test. When $b$ is as large as 0.03 (CPI = 0.6), 
all methods preserve type I error equally well. In terms of statistical power, 
the refined permutation test is as powerful as or more powerful than both of 
the other two methods. 
The simple permutation test, however, has the lowest power when there is 
a fair number of secondary (nonindex) cases, especially when both $b$ and $p_1$ 
are large. 

In Table 3 the population size is increased to 500 with 100 households. 
Similar to Table 2, we observe that the asymptotic test is conservative, with 
type I errors lower than 0.05 (0.031 to 0.037). When $p_1$ is relatively small, that is, at 
the second row for each level of $b$, the asymptotic test is not as powerful as the 
resampling methods. The three methods tend to have the same performance 
when $p_1$ increases. Again, the refined permutation method seems to be the 
best choice in these circumstances. 

Table 3 

Comparison of type I error and power between the permutation test and the asymptotic 
test for models with only $b$ and $p_1$. The community is composed of 100 households of size 
5. Results are based on 2000 simulations. 2000 permuted samples were drawn for each 
test 

b        CPI     p_1^a   SAR_1   N_idx^b   N_tot^c   Asymptotic   Simple permutation   Refined permutation 
0.0005   0.015   0.0     0.0     7         7         0.037        0.042                0.046 
0.0005   0.015   0.010   0.039   7         8         0.51         0.52                 0.53 
0.0005   0.015   0.020   0.078   7         9         0.78         0.77                 0.78 
0.0005   0.015   0.030   0.11    7         10        0.87         0.86                 0.87 
0.0010   0.03    0.0     0.0     13        14        0.031        0.047                0.047 
0.0010   0.03    0.010   0.039   13        16        0.59         0.64                 0.64 
0.0010   0.03    0.015   0.059   13        17        0.78         0.81                 0.81 
0.0010   0.03    0.020   0.078   13        18        0.88         0.90                 0.90 
0.0050   0.14    0.0     0.0     51        66        0.037        0.049                0.053 
0.0050   0.14    0.005   0.020   51        69        0.43         0.45                 0.47 
0.0050   0.14    0.010   0.039   51        74        0.85         0.85                 0.86 
0.0050   0.14    0.015   0.059   51        78        0.97         0.97                 0.97 

a Type I errors are reported when $p_1 = 0$. 
b N_idx is the average number of index cases. 
c N_tot is the average total number of cases. 